Rank | Count | Beginning
---|---|---
1 | 15655 | Ang |
2 | 12863 | Kini |
439 | 331 | Sumala |
154 | 163 | Sa |
29 | 91 | Alang |
1515 | 84 | Si |
281 | 53 | Mapa |
1092 | 33 | Kining |
4182 | 30 | Usa |
255 | 29 | Siya |
415 | 27 | Kon |
797 | 27 | Mao |
1934 | 23 | Apan |
2688 | 19 | May |
2580 | 12 | Ug |
2738 | 12 | Pipila |
130 | 10 | Bisan |
6930 | 9 | Daghan |
2437 | 8 | Binisayang |
6078 | 7 | Nahimutang |
308 | 6 | Mahimo |
328 | 6 | Daghang |
844 | 6 | Niadtong |
2209 | 6 | Human |
3460 | 6 | Anaa |
7373 | 6 | Sila |
59 | 5 | Anak |
1588 | 5 | Template:Mga-dibisyon-sa-pagdumala-sa-pransiya |
1801 | 5 | Mga |
2211 | 5 | Samtang |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
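The same counting can be sketched outside the database, and generalized to beginnings of N words. The following is a minimal Python sketch, assuming the sentences are available as a list of strings. The function name `top_beginnings`, its parameters, and the sample sentences are hypothetical and not part of the corpus tooling. Splitting on a single space and keeping the first `n` parts is meant to mirror `substring_index(sentence, ' ', n)`, which returns the whole sentence when it contains fewer than `n` spaces.

```python
from collections import Counter


def top_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n)
    by splitting on single spaces and keeping the first n parts.
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    # most_common sorts by count in descending order, like ORDER BY cnt DESC
    return counts.most_common(limit)


# Made-up sample sentences for illustration only (not taken from the corpus).
sample = ["Ang usa", "Ang duha", "Kini mao", "Ang tulo", "Kini usa"]
print(top_beginnings(sample, n=1))
```

Calling `top_beginnings(sentences, n=2)` would give the two-word beginnings under the same assumptions.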
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV